Text Classification Based On Manifold Semi-Supervised Support Vector Machine

Authors

  • Vo Duy Thanh
  • Vo Trung Hung
  • Pham Minh Tuan
  • Ho Khac Hung
Abstract

This article presents a solution, with experimental results, that applies semi-supervised machine learning and an improved SVM (Support Vector Machine) based on a geodesic model to text classification for the Vietnamese language. The objective is to improve semi-supervised learning by replacing the SVM kernel function with one based on a geodesic distance algorithm. The experiment is run on five classes of data extracted from documents on five topics (sports, education, law, international news, and society news) on dantri.com.vn. It compares classification accuracy with and without the geodesic-distance improvement to the semi-supervised SVM. The proposed model, called manifold semi-supervised machine learning, shows a significant improvement in both quality and stability over the plain SVM algorithm.

Keywords: text classification, support vector machine (SVM), semi-supervised learning, manifold learning, geodesic model
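The abstract does not say how the geodesic distance is computed or how it enters the kernel. The sketch below illustrates one common construction under stated assumptions, and is not the authors' method: geodesic distance is approximated by shortest paths over a k-nearest-neighbour graph (as in Isomap), and the distance is turned into a similarity with a Gaussian form. The function names `knn_graph`, `geodesic_distances`, and `geodesic_kernel`, and the parameters `k` and `sigma`, are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` to every node."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_kernel(points, k=2, sigma=1.0):
    """Assumed kernel form: exp(-d_geo^2 / (2 * sigma^2)).

    Unreachable pairs have infinite distance and get similarity 0.
    """
    graph = knn_graph(points, k)
    n = len(points)
    kernel = []
    for i in range(n):
        d = geodesic_distances(graph, i)
        kernel.append(
            [math.exp(-d[j] ** 2 / (2 * sigma ** 2)) for j in range(n)]
        )
    return kernel


# Toy data: five points on a half circle. The path along the graph between
# the two end points is longer than the straight line between them.
pts = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
       for a in (0, 45, 90, 135, 180)]
K = geodesic_kernel(pts, k=2, sigma=1.0)
```

A matrix like `K` could in principle be passed to an SVM implementation that accepts precomputed kernels. A Gaussian of geodesic distances is not guaranteed to be positive semidefinite, so this should be read as an illustration of the idea, not a drop-in kernel.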

Similar resources

Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran

Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...

Improved Nearest Neighbor Methods For Text Classification With Language Modeling and Harmonic Functions

We present new nearest neighbor methods for text classification and an evaluation of these methods against the existing nearest neighbor methods as well as other well-known text classification algorithms. Inspired by the language modeling approach to information retrieval, we show improvements in k-nearest neighbor (kNN) classification by replacing the classical cosine similarity with a KL dive...

Improved Nearest Neighbor Methods For Text Classification

We present new nearest neighbor methods for text classification and an evaluation of these methods against the existing nearest neighbor methods as well as other well-known text classification algorithms. Inspired by the language modeling approach to information retrieval, we show improvements in k-nearest neighbor (kNN) classification by replacing the classical cosine similarity with a KL dive...

Network Video Online Semi-supervised Classification Algorithm Based on Multiple View Co-training

Because multimodal information integration suffers from problems such as a complex calculation process and low classification accuracy in network video classification, the authors propose a network video online semi-supervised classification algorithm based on multiple-view co-training. After extracting the features in the text view and the visual view, for the feature vector in each view, u...

Linear Manifold Regularization for Large Scale Semi-supervised Learning

The enormous wealth of unlabeled data in many applications of machine learning is beginning to pose challenges to the designers of semi-supervised learning methods. We are interested in developing linear classification algorithms to efficiently learn from massive partially labeled datasets. In this paper, we propose Linear Laplacian Support Vector Machines and Linear Laplacian Regularized Least...

Publication date: 2014